Results 1 - 20 of 3,626
1.
Database (Oxford) ; 2024, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38713861

ABSTRACT

Cancer immunotherapy has brought about a revolutionary breakthrough in cancer treatment and has changed the treatment landscape for a variety of solid and hematologic malignancies. To help researchers efficiently find valuable information related to cancer immunotherapy, we present DIRMC, a manually curated, comprehensive database that focuses on the molecular features involved in cancer immunotherapy. All content was collected manually from published literature, authoritative clinical trial data submitted by clinicians, drug target prediction databases such as DrugBank, and experimentally confirmed high-throughput data sets that characterize immune-related molecular interactions in cancer, such as a curated database of T-cell receptor sequences with known antigen specificity (VDJdb), a pathology-associated TCR database (McPAS-TCR), and others. By constructing a fully connected functional network, ranging from cancer-related gene mutations to target genes to translated target proteins to protein regions or sites that may specifically affect protein function, we aim to comprehensively characterize molecular features related to cancer immunotherapy. We have developed scoring criteria to assess the reliability of each MHC-peptide-T-cell receptor (TCR) interaction item as a reference for users. The database provides a user-friendly interface to browse and retrieve data by genes, target proteins, diseases and more. DIRMC also provides a download and submission page where researchers can access data of interest for further investigation or submit new interactions related to cancer immunotherapy targets. Furthermore, DIRMC provides a graphical interface to help users predict the binding affinity between their own peptide of interest and MHC or TCR.
This database will provide researchers with a one-stop resource for understanding cancer immunotherapy-related targets as well as data on MHC-peptide-TCR interactions. It aims to offer reliable molecular characterization in support of both the analysis of the current status of cancer immunotherapy and the development of new immunotherapies. Database URL: http://www.dirmc.tech/.


Subjects
Immunotherapy , Neoplasms , Immunotherapy/methods , Humans , Neoplasms/immunology , Neoplasms/genetics , Neoplasms/therapy , Receptors, Antigen, T-Cell/immunology , Receptors, Antigen, T-Cell/genetics , Databases, Protein , User-Computer Interface
2.
Microbiome ; 12(1): 46, 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38454512

ABSTRACT

BACKGROUND: By analyzing the proteins which are the workhorses of biological systems, metaproteomics allows us to list the taxa present in any microbiota, monitor their relative biomass, and characterize the functioning of complex biological systems. RESULTS: Here, we present a new strategy for rapidly determining the microbial community structure of a given sample and designing a customized protein sequence database to optimally exploit extensive tandem mass spectrometry data. This approach leverages the capabilities of the first generation of Quadrupole Orbitrap mass spectrometer incorporating an asymmetric track lossless (Astral) analyzer, offering rapid MS/MS scan speed and sensitivity. We took advantage of data-dependent acquisition and data-independent acquisition strategies using a peptide extract from a human fecal sample spiked with precise amounts of peptides from two reference bacteria. CONCLUSIONS: Our approach, which combines both acquisition methods, proves to be time-efficient while processing extensive generic databases and massive datasets, achieving a coverage of more than 122,000 unique peptides and 38,000 protein groups within a 30-min DIA run. This marks a significant departure from current state-of-the-art metaproteomics methodologies, resulting in broader coverage of the metabolic pathways governing the biological system. In combination, our strategy and the Astral mass analyzer represent a quantum leap in the functional analysis of microbiomes. Video Abstract.


Subjects
Microbiota , Tandem Mass Spectrometry , Humans , Tandem Mass Spectrometry/methods , Proteomics/methods , Peptides , Databases, Protein
3.
Sci Rep ; 14(1): 3112, 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38326407

ABSTRACT

Corticotropin-releasing hormone-binding protein (CRHBP) is involved in many physiological processes. However, it is still unclear what role CRHBP has in tumor immunity and prognosis prediction. Using databases such as the Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), Tumor Protein Database, Timer Database, and Gene Expression Profiling Interactive Analysis (GEPIA), we evaluated the potential role of CRHBP in diverse cancers. Further research looked into the relationships between CRHBP and tumor survival prognosis, immune infiltration, immune checkpoint (ICP) indicators, tumor mutation burden (TMB), microsatellite instability (MSI), mismatch repair (MMR), DNA methylation, tumor microenvironment (TME), and drug responsiveness. The anticancer effect of CRHBP in liver hepatocellular carcinoma (LIHC) was shown by Western blotting, EdU staining, JC-1 staining, transwell test, and wound healing assays. CRHBP expression is significantly low in the majority of tumor types and is associated with survival prognosis, ICP markers, TMB, and microsatellite instability (MSI). The expression of CRHBP was found to be substantially related to the quantity of six immune cell types, as well as the interstitial and immunological scores, showing that CRHBP has a substantial impact in the TME. We also noticed a link between the IC50 of a number of anticancer medicines and the degree of CRHBP expression. CRHBP-related signaling pathways were discovered using functional enrichment. Cox regression analysis showed that CRHBP expression was an independent prognostic factor for LIHC. CRHBP has a tumor suppressor function in LIHC, according to cell and molecular biology trials. CRHBP has a significant impact on tumor immunity, treatment, and prognosis, and has the potential as a cancer treatment target and prognostic indicator.


Subjects
Carcinoma, Hepatocellular , Liver Neoplasms , Humans , Carcinoma, Hepatocellular/drug therapy , Carcinoma, Hepatocellular/genetics , Microsatellite Instability , Prognosis , Databases, Protein , Liver Neoplasms/drug therapy , Liver Neoplasms/genetics , Tumor Microenvironment/genetics
4.
Bioinformatics ; 40(3), 2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38377393

ABSTRACT

MOTIVATION: Eukaryotic linear motifs (ELMs), or Short Linear Motifs, are protein interaction modules that play an essential role in cellular processes and signaling networks and are often involved in diseases like cancer. The ELM database is a collection of manually curated motif knowledge from scientific papers. It has become a crucial resource for investigating motif biology and recognizing candidate ELMs in novel amino acid sequences. Users can search amino acid sequences or UniProt Accessions on the ELM resource web interface. However, as with many web services, there are limitations in the swift processing of large-scale queries through the ELM web interface or API calls, and, therefore, integration into protein function analysis pipelines is limited. RESULTS: To allow swift, large-scale motif analyses on protein sequences using ELMs curated in the ELM database, we have extended the gget suite of Python and command line tools with a new module, gget elm, which does not rely on the ELM server for efficiently finding candidate ELMs in user-submitted amino acid sequences and UniProt Accessions. gget elm increases accessibility to the information stored in the ELM database and allows scalable searches for motif-mediated interaction sites in the amino acid sequences. AVAILABILITY AND IMPLEMENTATION: The manual and source code are available at https://github.com/pachterlab/gget.
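The core idea of scanning sequences locally, without calling a remote server, can be illustrated with a minimal Python sketch based on regular expressions. This is not the gget elm implementation: the two patterns below are hypothetical toy motifs invented for the example (they are not ELM database entries), and `find_motifs` is an illustrative helper name.

```python
import re

# Hypothetical toy patterns for illustration only; NOT entries from the ELM database.
TOY_MOTIFS = {
    "toy_PxxP": r"P..P",
    "toy_RxxS": r"R..[ST]",
}

def find_motifs(sequence, motifs=TOY_MOTIFS):
    """Return (motif_name, start, end, matched_text) for every match, overlaps included."""
    hits = []
    for name, pattern in motifs.items():
        # A zero-width lookahead lets overlapping occurrences be reported.
        for m in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, m.start(1), m.end(1), m.group(1)))
    # Sort by start position, then by motif name, for a stable report.
    return sorted(hits, key=lambda h: (h[1], h[0]))

hits = find_motifs("MAPPLPRKLSGPAAPQ")
for hit in hits:
    print(hit)
```

Because the patterns are held in memory, the same function can be mapped over many sequences in a pipeline; a real analysis would load curated motif definitions instead of the toy dictionary.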


Subjects
Proteins , Software , Amino Acid Motifs , Databases, Protein , Proteins/chemistry , Amino Acid Sequence
5.
Proteomics ; 24(8): e2300084, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38380501

ABSTRACT

Assigning statistical confidence estimates to discoveries produced by a tandem mass spectrometry proteomics experiment is critical to enabling principled interpretation of the results and assessing the cost/benefit ratio of experimental follow-up. The most common technique for computing such estimates is to use target-decoy competition (TDC), in which observed spectra are searched against a database of real (target) peptides and a database of shuffled or reversed (decoy) peptides. TDC procedures for estimating the false discovery rate (FDR) at a given score threshold have been developed for application at the level of spectra, peptides, or proteins. Although these techniques are relatively straightforward to implement, it is common in the literature to skip over the implementation details or even to make mistakes in how the TDC procedures are applied in practice. Here we present Crema, an open-source Python tool that implements several TDC methods of spectrum-, peptide- and protein-level FDR estimation. Crema is compatible with a variety of existing database search tools and provides a straightforward way to obtain robust FDR estimates.
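As a rough illustration of spectrum-level TDC, the following Python sketch estimates the FDR at each score threshold and converts it to q-values. It is not Crema's code or API: the function name, the input layout, and the "+1" decoy correction are assumptions chosen for the example, and tied scores are not treated specially.

```python
def tdc_qvalues(psms):
    """psms: list of (score, is_decoy), one entry per spectrum after target-decoy
    competition (only the better of the target and decoy match is kept).
    Returns (score, q_value) for the target PSMs, best score first."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    fdrs = []
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among targets accepted at this threshold.
        # The "+1" is a conservative correction (an assumption of this sketch).
        fdrs.append(min(1.0, (decoys + 1) / max(targets, 1)))
    # q-value: the smallest estimated FDR at this threshold or any more permissive one.
    qvals = fdrs[:]
    for i in range(len(qvals) - 2, -1, -1):
        qvals[i] = min(qvals[i], qvals[i + 1])
    return [(s, q) for (s, is_decoy), q in zip(ranked, qvals) if not is_decoy]

# Made-up scores for illustration.
example = [(9.1, False), (8.7, False), (8.2, False), (7.9, True),
           (7.5, False), (6.0, False), (5.5, True)]
result = tdc_qvalues(example)
for score, q in result:
    print(score, round(q, 3))
```

With so few PSMs the "+1" term dominates and the q-values are large; on realistic data sets with thousands of PSMs its effect is small.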


Subjects
Algorithms , Peptides , Databases, Protein , Peptides/chemistry , Proteins/analysis , Proteomics/methods
6.
J Ethnopharmacol ; 326: 117959, 2024 May 23.
Article in English | MEDLINE | ID: mdl-38423413

ABSTRACT

ETHNOPHARMACOLOGICAL RELEVANCE: Compound Jixuecao Decoction (CJD) is a traditional Chinese herbal medicine prescribed in China to treat chronic renal failure (CRF). Previous studies have shown that CJD affects cell apoptosis and proliferation. However, the mechanism of its renal protective action has not been characterized. AIM OF THE STUDY: To explore the mechanism(s) underlying the effect of CJD on endoplasmic reticulum stress (ERS) and apoptosis in the treatment of CRF using network pharmacology, molecular docking, molecular dynamics simulations, and in vivo studies. MATERIALS AND METHODS: The compounds comprising CJD were extracted from the Traditional Chinese Medicine Systems Pharmacology Database. A Swiss target prediction database and similarity integration approach were employed to identify potential targets of these components. The GeneCards and DisGeNET databases were used to identify targets associated with CRF, apoptosis, and ERS. The STRING database was employed to analyze the protein-protein interactions (PPIs) associated with drug-disease crossover. A chemical composition-shared target network was established, and critical pathways were identified through gene ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analyses. The Protein Data Bank database was used to search key proteins, while molecular docking and dynamics simulations were performed between the top four CJD active ingredients and proteins involved in apoptosis and ERS in CRF. Subsequent in vivo studies using a 5/6 nephrectomy rat model of CRF were performed to verify the findings. RESULTS: The 80 compounds identified in CJD yielded 875 target genes, of which 216 were potentially related to CRF. PPI network analysis revealed key targets via topology filtering. 
Enrichment analysis, molecular docking, and molecular dynamics simulation results suggested that CJD primarily targets mitofusin-2 (MFN2), B-cell lymphoma-2 (BCL2), BAX, protein kinase RNA-like ER kinase (PERK), and C/EBP homologous protein (CHOP) during CRF treatment. In vivo, CJD significantly increased the abundance of MFN2 and BCL2 and significantly reduced the abundance of BAX, PERK, and CHOP proteins in kidney tissues, indicating that CJD could alleviate apoptosis and ERS in CRF rats. CONCLUSIONS: This study provides evidence that CJD effectively delays CRF through modulation of the MFN2 and PERK-eIF2α-ATF4-CHOP signaling pathways.


Subjects
Drugs, Chinese Herbal , Kidney Failure, Chronic , Renal Insufficiency, Chronic , Animals , Rats , Molecular Docking Simulation , bcl-2-Associated X Protein , Endoplasmic Reticulum Stress , Apoptosis , Databases, Protein , Medicine, Chinese Traditional , Drugs, Chinese Herbal/pharmacology , Drugs, Chinese Herbal/therapeutic use
7.
BMC Res Notes ; 17(1): 50, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38365785

ABSTRACT

OBJECTIVE: The superfamily of protein kinases features a common Protein Kinase-like (PKL) three-dimensional fold. Proteins with PKL structure can also possess enzymatic activities other than protein phosphorylation, such as AMPylation or glutamylation. PKL proteins play a vital role in the world of living organisms, contributing to the survival of pathogenic bacteria inside host cells, as well as being involved in carcinogenesis and neurological diseases in humans. The superfamily of PKL proteins is constantly growing. Therefore, it is crucial to gather new information about PKL families. RESULTS: To this end, the KINtaro database ( http://bioinfo.sggw.edu.pl/kintaro/ ) has been created as a resource for collecting and sharing such information. KINtaro combines protein sequence information and additional annotations for more than 70 PKL families, including 32 families not associated with PKL superfamily in established protein domain databases. KINtaro is searchable by keywords and by protein sequence and provides family descriptions, sequences, sequence alignments, HMM models, 3D structure models, experimental structures with PKL domain annotations and sequence logos with catalytic residue annotations.


Subjects
Protein Kinases , Proteins , Humans , Protein Kinases/genetics , Phosphorylation , Amino Acid Sequence , Sequence Alignment , Databases, Protein
8.
BMC Med Genomics ; 17(1): 2, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38167072

ABSTRACT

BACKGROUND: Lymphangiogenesis plays an important role in tumor progression and is significantly associated with tumor immune infiltration. However, the role and mechanisms of lymphangiogenesis in colorectal cancer (CRC) are still unknown. Thus, the objective was to identify the lymphangiogenesis-related genes associated with immune infiltration and to investigate their prognostic value. METHODS: mRNA expression profiles and corresponding clinical information of CRC samples were obtained from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. The lymphangiogenesis-related genes (LymRGs) were collected from the Molecular Signatures database (MSigDB). Lymphangiogenesis score (LymScore) and immune cell infiltration levels were quantified using ssGSEA. Hub genes related to LymScore and immune cell infiltration levels were identified using weighted gene co-expression network analysis (WGCNA). Univariate Cox and LASSO regression analyses were performed to identify the prognostic gene signature and construct a risk model. Furthermore, a predictive nomogram was constructed based on the independent risk factor generated from a multivariate Cox model. RESULTS: A total of 1076 hub genes related to LymScore and immune cell infiltration levels were identified from three key modules by WGCNA. LymScore was positively associated with the infiltration of natural killer cells and regulatory T cells. These modular genes were enriched in extracellular matrix and structure, collagen fibril organization, cell-substrate adhesion, etc. NUMBL, TSPAN11, PHF21A, PDGFRA, ZNF385A, and RIMKLB were eventually identified as the prognostic gene signature in CRC. Patients were divided into high-risk and low-risk groups based on the median risk score; patients in the high-risk group had poor survival and were predisposed to metastasis and advanced stages.
NUMBL and PHF21A were upregulated but PDGFRA was downregulated in tumor samples compared with normal samples in the Human Protein Atlas (HPA) database. CONCLUSION: Our findings highlight the critical role of lymphangiogenesis in CRC progression and metastasis and provide a novel gene signature for CRC and novel strategies for anti-lymphangiogenic therapies in CRC.
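The median split on a risk score described in the RESULTS can be sketched in a few lines of Python. The abstract does not report the fitted coefficients, so the weights below are hypothetical placeholders (including their signs), the patient values are made up, and the linear form (weighted sum of expression values) is an assumption based on how Cox/LASSO signatures are commonly scored.

```python
from statistics import median

# Hypothetical coefficients for illustration; the abstract does not report fitted values.
COEFS = {"NUMBL": 0.30, "TSPAN11": 0.20, "PHF21A": 0.25,
         "PDGFRA": 0.10, "ZNF385A": 0.15, "RIMKLB": 0.05}

def risk_score(expr, coefs=COEFS):
    """Weighted sum of signature-gene expression values for one patient."""
    return sum(coefs[g] * expr[g] for g in coefs)

def split_by_median(patients, coefs=COEFS):
    """Assign each patient to the 'high' or 'low' risk group relative to the median score."""
    scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return groups, cutoff

# Made-up expression values: patient pN has expression N for every signature gene.
patients = {f"p{v}": {g: float(v) for g in COEFS} for v in (1, 2, 3, 4)}
groups, cutoff = split_by_median(patients)
print(groups, cutoff)
```

Patients whose score equals the cutoff fall into the low-risk group here; that tie rule is a choice of this sketch, not something stated in the abstract.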


Subjects
Colorectal Neoplasms , Lymphangiogenesis , Humans , Lymphangiogenesis/genetics , Computational Biology , Databases, Protein , Gene Expression Profiling , Colorectal Neoplasms/genetics , Prognosis , Tetraspanins
9.
Aging (Albany NY) ; 16(1): 367-388, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38189809

ABSTRACT

BACKGROUND: Transmembrane protein 25 (TMEM25) stands out as a potential prognostic biomarker and therapeutic target in cancer, yet its precise mechanism of action in clear cell renal cell carcinoma (ccRCC) remains unclear. MATERIALS AND METHODS: Gene expression data and clinically relevant information were extracted from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases to characterize the expression patterns of TMEM25 in ccRCC and to assess its prognostic and diagnostic significance. Protein expression data were obtained from the Human Protein Atlas (HPA) database. Further, qPCR experiments conducted on cells and tissues were used to confirm the gene's expression status. Additionally, the correlations between TMEM25 expression and DNA methylation, gene mutations, immune cell infiltration, and drug sensitivity were explored within this specific tumor context. RESULTS: At both the RNA and protein levels, TMEM25 is markedly downregulated, and this downregulation is consistently linked to an unfavorable prognosis. Receiver Operating Characteristic (ROC) curve analysis and univariate and multivariate Cox regression analyses confirmed the diagnostic and prognostic value of TMEM25 in ccRCC. Its expression correlated closely with various immune cell types, immune checkpoints, immune inhibitors, and MHC molecules. Within ccRCC tissues, TMEM25 DNA methylation levels were elevated, and this increase was observed across various conditions. TMEM25 mutations also affect the prognosis of ccRCC patients, and the results of drug sensitivity analyses are useful for clinical decision-making. CONCLUSIONS: TMEM25 could potentially function as a tumor suppressor gene in ccRCC, holding substantial promise as a novel biomarker for the diagnosis, treatment, and prognosis of ccRCC patients.


Subjects
Carcinoma, Renal Cell , Carcinoma , Kidney Neoplasms , Humans , Carcinoma, Renal Cell/genetics , Databases, Protein , Kidney Neoplasms/genetics , Biomarkers , Prognosis
10.
J Proteome Res ; 23(2): 834-843, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38252705

ABSTRACT

In shotgun proteomics, the proteome search engine analyzes mass spectra obtained by experiments, and a peptide-spectrum match (PSM) is then reported for each spectrum. However, most of the PSMs identified are incorrect, and therefore various postprocessing tools have been developed for reranking the peptide identifications. Yet these methods suffer from issues such as dependency on distribution, reliance on shallow models, and limited effectiveness. In this work, we propose AttnPep, a deep learning model for rescoring PSMs that utilizes the Self-Attention module. This module helps the neural network focus on features relevant to the classification of PSMs and ignore irrelevant features. This allows AttnPep to analyze the output of different search engines and improve PSM discrimination accuracy. We considered a PSM to be correct if it achieved a q-value <0.01 and compared AttnPep with the existing mainstream software PeptideProphet, Percolator, and proteoTorch. The results indicated that AttnPep identified, on average, 9.29% more correct PSMs than the other methods. Additionally, AttnPep was able to better distinguish between correct and incorrect PSMs and found more synthetic peptides in the complex SWATH data set.


Subjects
Algorithms , Deep Learning , Proteomics/methods , Tandem Mass Spectrometry/methods , Peptides , Software , Databases, Protein
11.
Nucleic Acids Res ; 52(D1): D522-D528, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37956315

ABSTRACT

The OpenProt proteogenomic resource (https://www.openprot.org/) provides users with a complete and freely accessible set of non-canonical or alternative open reading frames (AltORFs) within the transcriptome of various species, as well as functional annotations of the corresponding protein sequences not found in standard databases. Enhancements in this update are largely the result of user feedback and include the prediction of structure, subcellular localization, and intrinsic disorder, using cutting-edge algorithms based on machine learning techniques. The mass spectrometry pipeline now integrates a machine learning-based peptide rescoring method to improve peptide identification. We continue to help users explore this cryptic proteome by providing OpenCustomDB, a tool that enables users to build their own customized protein databases, and OpenVar, a genomic annotator including genetic variants within AltORFs and protein sequences. A new interface improves the visualization of all functional annotations, including a spectral viewer and the prediction of multicoding genes. All data on OpenProt are freely available and downloadable. Overall, OpenProt continues to establish itself as an important resource for the exploration and study of new proteins.


Subjects
Databases, Protein , Peptides , Proteomics , Amino Acid Sequence , Genomics , Internet , Peptides/genetics , Proteome/genetics , Proteomics/methods , Humans
12.
J Proteome Res ; 23(1): 377-385, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38091499

ABSTRACT

Species identification of fragmentary bones remains a challenging task in archeology and forensics. A species identification method for such fragmentary bones that has recently attracted interest is the use of bone collagen proteins. Here, we describe a method similar to DNA barcoding that reads collagen protein sequences in bone and automatically determines the species by performing sequence database searches. The method is almost identical to conventional shotgun proteomics analysis of bone samples, except that the database used by the SEQUEST search engine consisted only of entries for collagen type 1 alpha 2 (COL1A2) proteins from various vertebrates. Accordingly, the COL1A2 peptides that differ in sequence among species act as species marker peptides. In SEQUEST-based shotgun proteomics, the protein entries that contain more marker peptide sequences are assigned higher scores; therefore, the highest-scoring protein entry will be the COL1A2 entry for the species from which the analyzed bone was derived. We tested our method using bone samples from 30 vertebrate species and found that all species were correctly identified. In conclusion, COL1A2 can be used as a bone protein barcode and can be read through shotgun proteomics, allowing for automatic bone species identification. Data are available via ProteomeXchange with the identifier PXD045402.
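The scoring logic (the entry containing the most observed marker peptides wins) can be illustrated with a simplified Python sketch. It replaces SEQUEST scoring with a plain count of observed peptides found as substrings of each entry, and the sequences and species names are made-up toy data, not real COL1A2 sequences.

```python
def count_marker_peptides(observed_peptides, col1a2_by_species):
    """Count, per species, how many distinct observed peptides occur in its entry."""
    observed = set(observed_peptides)
    return {species: sum(1 for pep in observed if pep in seq)
            for species, seq in col1a2_by_species.items()}

def identify_species(observed_peptides, col1a2_by_species):
    """Return the species whose entry matches the most observed peptides, plus all counts."""
    counts = count_marker_peptides(observed_peptides, col1a2_by_species)
    best = max(counts, key=counts.get)
    return best, counts

# Toy database: invented sequences that differ at two positions.
toy_db = {
    "species_A": "GPAGPSGPRGLPGAKGETGPR",
    "species_B": "GPAGPTGPRGLPGAKGDTGPR",
}
# Peptides "observed" from a bone sample (invented for the example).
observed = ["GPAGPSGPR", "GLPGAK", "GETGPR"]

best, counts = identify_species(observed, toy_db)
print(best, counts)
```

The peptide "GLPGAK" is shared by both toy entries and therefore carries no species information; only the peptides that differ between entries act as markers, which mirrors the reasoning in the abstract.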


Subjects
Proteins , Proteomics , Animals , Proteomics/methods , Proteins/analysis , Peptides/analysis , Amino Acid Sequence , Databases, Protein
13.
Proteomics ; 24(6): e2300236, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37706597

ABSTRACT

Clinical biomarker discovery is often based on the analysis of human plasma samples. However, the high dynamic range and complexity of plasma pose significant challenges to mass spectrometry-based proteomics. Current methods for improving protein identifications require laborious pre-analytical sample preparation. In this study, we developed and evaluated a TMTpro-specific spectral library for improved protein identification in human plasma proteomics. The library was constructed by LC-MS/MS analysis of highly fractionated TMTpro-tagged human plasma, human cell lysates, and relevant arterial tissues. The library was curated using several quality filters to ensure reliable peptide identifications. Our results show that spectral library searching using the TMTpro spectral library improves the identification of proteins in plasma samples compared to conventional sequence database searching. Protein identifications made by the spectral library search engine demonstrated a high degree of complementarity with the sequence database search engine, indicating the feasibility of increasing the number of protein identifications without additional pre-analytical sample preparation. The TMTpro-specific spectral library provides a resource for future plasma proteomics research and optimization of search algorithms for greater accuracy and speed in protein identifications in human plasma proteomics, and is made publicly available to the research community via ProteomeXchange with identifier PXD042546.


Subjects
Proteomics , Software , Humans , Proteomics/methods , Chromatography, Liquid/methods , Tandem Mass Spectrometry/methods , Peptides/analysis , Proteins , Algorithms , Databases, Protein , Peptide Library
14.
Proteomics ; 24(5): e2300145, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37726251

ABSTRACT

Exact p-value (XPV)-based methods for dot product-like score functions-such as the XCorr score implemented in Tide, SEQUEST, Comet or shared peak count-based scoring in MSGF+ and ASPV-provide a fairly good calibration for peptide-spectrum-match (PSM) scoring in database searching-based MS/MS spectrum data identification. Unfortunately, standard XPV methods, in practice, cannot handle high-resolution fragmentation data produced by state-of-the-art mass spectrometers because having smaller bins increases the number of fragment matches that are assigned to incorrect bins and scored improperly. In this article, we present an extension of the XPV method, called the high-resolution exact p-value (HR-XPV) method, which can be used to calibrate PSM scores of high-resolution MS/MS spectra obtained with dot product-like scoring such as the XCorr. The HR-XPV carries remainder masses throughout the fragmentation, allowing them to greatly increase the number of fragments that are properly assigned to the correct bin and, thus, taking advantage of high-resolution data. Using four mass spectrometry data sets, our experimental results demonstrate that HR-XPV produces well-calibrated scores, which in turn results in more trusted spectrum annotations at any false discovery rate level.
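The binning problem can be made concrete with a small Python sketch. It is one possible reading of "carrying remainder masses", not the authors' HR-XPV algorithm: residue masses are stored as integers in units of 1e-5 Da so that the arithmetic is exact, prefix masses ignore the proton and terminal-group offsets of real fragment ions, and all function names are illustrative.

```python
# Monoisotopic residue masses in units of 1e-5 Da (integers keep the arithmetic exact).
RESIDUE_MASS = {"G": 5702146, "A": 7103711, "S": 8703203, "P": 9705276,
                "V": 9906841, "L": 11308406, "K": 12809496, "E": 12904259}

def prefix_bins_naive(peptide, bin_width):
    """Discretize each residue mass separately and add up the bin indices."""
    idx, out = 0, []
    for aa in peptide:
        idx += RESIDUE_MASS[aa] // bin_width
        out.append(idx)
    return out

def prefix_bins_with_remainder(peptide, bin_width):
    """Same recursion, but carry the sub-bin remainder from residue to residue."""
    idx, rem, out = 0, 0, []
    for aa in peptide:
        total = rem + RESIDUE_MASS[aa]
        idx += total // bin_width
        rem = total % bin_width
        out.append(idx)
    return out

def prefix_bins_exact(peptide, bin_width):
    """Reference: bin index of the exact cumulative prefix mass."""
    mass, out = 0, []
    for aa in peptide:
        mass += RESIDUE_MASS[aa]
        out.append(mass // bin_width)
    return out

peptide = "GASPVLKE"
width = 2000  # 0.02 Da bins, expressed in 1e-5 Da units
naive = prefix_bins_naive(peptide, width)
carried = prefix_bins_with_remainder(peptide, width)
exact = prefix_bins_exact(peptide, width)
print(naive)
print(carried)
print(exact)
```

With narrow bins the per-residue rounding errors accumulate, so the naive indices drift below the exact ones as the prefix grows, whereas carrying the remainder reproduces the exact bin at every step.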


Subjects
Algorithms , Tandem Mass Spectrometry , Tandem Mass Spectrometry/methods , Software , Peptides/chemistry , Calibration , Databases, Protein
15.
Nucleic Acids Res ; 52(D1): D1062-D1071, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38000392

ABSTRACT

The SysteMHC Atlas v1.0 was the first public repository dedicated to mass spectrometry-based immunopeptidomics. Here we introduce the newly released SysteMHC Atlas v2.0 (https://systemhc.sjtu.edu.cn), a comprehensive collection of 7190 MS files from 303 allotypes. We extended and optimized a computational pipeline that allows the identification of MHC-bound peptides carrying unexpected post-translational modifications (PTMs), resulting in 471K modified peptides identified across 60 distinct PTM types. In total, we identified approximately 1.0 million and 1.1 million unique peptides for MHC class I and class II immunopeptidomes, respectively, representing 6.8-fold and 28-fold increases over v1.0. The SysteMHC Atlas v2.0 introduces several new features, including the inclusion of non-UniProt peptides and the incorporation of several novel computational tools for FDR estimation, binding affinity prediction and motif deconvolution. Additionally, we enhanced the user interface, upgraded the website framework, and provided external links to other related resources. Finally, we built and provided various spectral libraries as community resources for data mining and future immunopeptidomic and proteomic analysis. We believe that the SysteMHC Atlas v2.0 is a unique resource that provides key insights to the immunology and proteomics community and will accelerate the development of vaccines and immunotherapies.


Subjects
Databases, Protein , Peptides , Proteomics , Mass Spectrometry , Peptides/chemistry , Peptides/immunology , Protein Processing, Post-Translational , Proteomics/methods , Databases, Protein/standards , Internet , Humans , Animals
16.
Nucleic Acids Res ; 52(D1): D1289-D1304, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37870473

ABSTRACT

Tumorigenic functions due to the formation of fusion genes have been targeted for cancer therapeutics (i.e. kinase inhibitors). However, many fusion proteins involved in various cellular processes have not been studied for targeted therapeutics. This is because the lack of complete fusion protein sequences and their whole 3D structures has made it challenging to develop new therapeutic strategies. To fill these critical gaps, we developed a computational pipeline and a resource of human fusion proteins named FusionPDB, available at https://compbio.uth.edu/FusionPDB. FusionPDB is organized into four levels: 43K fusion protein sequences (14.7K in-frame fusion genes, Level 1), over 2300 + 1267 fusion protein 3D structures (from 2300 recurrent and 266 manually curated in-frame fusion genes, Level 2), pLDDT score analysis for the 1267 fusion proteins from 266 manually curated fusion genes (Level 3), and virtual screening outcomes for 68 selected fusion proteins from 266 manually curated fusion genes (Level 4). FusionPDB is the only resource providing whole 3D structures of fusion proteins and comprehensive knowledge of human fusion proteins. It will be regularly updated until it covers all human fusion proteins in the future.


Subjects
Databases, Protein , Humans , Amino Acid Sequence , Knowledge Bases , Neoplasms/genetics , Protein Conformation
17.
Nucleic Acids Res ; 52(D1): D1155-D1162, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37823596

ABSTRACT

Advancements in mass spectrometry (MS)-based proteomics have greatly facilitated the large-scale quantification of proteins and microproteins, thereby revealing altered signalling pathways across many different cancer types. However, specialized and comprehensive resources are lacking for cancer proteomics. Here, we describe CancerProteome (http://bio-bigdata.hrbmu.edu.cn/CancerProteome), which functionally deciphers and visualizes the proteome landscape in cancer. We manually curated and re-analyzed publicly available MS-based quantification and post-translational modification (PTM) proteomes, including 7406 samples from 21 different cancer types, and also examined protein abundances and PTM levels in 31 120 proteins and 4111 microproteins. Six major analytical modules were developed with a view to describe protein contributions to carcinogenesis using proteome analysis, including conventional analyses of quantitative and the PTM proteome, functional enrichment, protein-protein associations by integrating known interactions with co-expression signatures, drug sensitivity and clinical relevance analyses. Moreover, protein abundances, which correlated with corresponding transcript or PTM levels, were evaluated. CancerProteome is convenient as it allows users to access specific proteins/microproteins of interest using quick searches or query options to generate multiple visualization results. In summary, CancerProteome is an important resource, which functionally deciphers the cancer proteome landscape and provides a novel insight for the identification of tumor protein markers in cancer.


Subjects
Protein Databases, Neoplasms, Proteome, Humans, Mass Spectrometry/methods, Neoplasms/chemistry, Neoplasms/genetics, Post-Translational Protein Processing, Proteome/analysis, Proteomics/methods
18.
J Proteome Res ; 23(1): 185-214, 2024 01 05.
Article in English | MEDLINE | ID: mdl-38104260

ABSTRACT

This study describes a new release of the Arabidopsis thaliana PeptideAtlas proteomics resource (build 2023-10) providing protein sequence coverage, matched mass spectrometry (MS) spectra, selected post-translational modifications (PTMs), and metadata. 70 million MS/MS spectra were matched to the Araport11 annotation, identifying ∼0.6 million unique peptides and 18,267 proteins at the highest confidence level and 3396 lower confidence proteins, together representing 78.6% of the predicted proteome. Additional identified proteins not predicted in Araport11 should be considered for the next Arabidopsis genome annotation. This release identified 5198 phosphorylated proteins, 668 ubiquitinated proteins, 3050 N-terminally acetylated proteins, and 864 lysine-acetylated proteins and mapped their PTM sites. MS support was lacking for 21.4% (5896 proteins) of the predicted Araport11 proteome: the "dark" proteome. This dark proteome is highly enriched for E3 ligases, transcription factors, and certain signaling peptide families (e.g., CLE, IDA, PSY) but not others (e.g., THIONIN, CAP). A machine learning model trained on RNA expression data and protein properties predicts the probability that proteins will be detected. The model aids in the discovery of proteins with short half-lives (e.g., SIG1,3 and ERF-VII TFs) and in developing strategies to identify the missing proteins. PeptideAtlas is linked to TAIR, tracks in JBrowse, and several other community proteomics resources.
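The coverage percentages quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the identified and "dark" proteins together make up the whole predicted Araport11 proteome; that partition is an assumption made for illustration, not something the abstract states, and the variable names are hypothetical.

```python
# Counts quoted in the abstract.
highest_confidence = 18_267  # proteins identified at the highest confidence level
lower_confidence = 3_396     # proteins identified at lower confidence
dark = 5_896                 # predicted proteins lacking MS support

identified = highest_confidence + lower_confidence

# Assumption: identified + dark proteins partition the predicted proteome.
predicted_total = identified + dark

identified_pct = round(100 * identified / predicted_total, 1)
dark_pct = round(100 * dark / predicted_total, 1)

print(f"identified: {identified} of {predicted_total} ({identified_pct}%)")
print(f"dark proteome: {dark} of {predicted_total} ({dark_pct}%)")
```

Under that assumption the two shares come out at 78.6% and 21.4%, matching the figures reported in the abstract.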


Subjects
Arabidopsis, Humans, Arabidopsis/genetics, Arabidopsis/metabolism, Proteome/analysis, Tandem Mass Spectrometry/methods, Post-Translational Protein Processing, Peptides/analysis, Protein Databases
19.
J Proteome Res ; 23(2): 574-584, 2024 02 02.
Article in English | MEDLINE | ID: mdl-38157563

ABSTRACT

Accurate and comprehensive peptide precursor ions are crucial to tandem mass-spectrometry-based peptide identification. An identification engine can derive great advantages from the search space reduction enabled by credible and detailed precursors. Furthermore, by considering multiple precursors per spectrum, both the number of identifications and the spectrum explainability can be substantially improved. Here, we introduce PepPre, which detects precursors by decomposing peaks into multiple isotope clusters using linear programming methods. The detected precursors are scored and ranked, and the high-scoring ones are used for subsequent peptide identification. PepPre is evaluated both on regular and cross-linked peptide data sets and compared with 11 methods. The experimental results show that PepPre achieves a remarkable increase of 203% in PSM and 68% in peptide identifications compared to instrument software for regular peptides and 99% in PSM and 27% in peptide pair identifications for cross-linked peptides, surpassing the performance of all other evaluated methods. In addition to the increased identification numbers, further credibility evaluations evidence the reliability of the identified results. Moreover, by widening the isolation window of data acquisition from 2 to 8 Th, with PepPre, an engine is able to identify at least 64% more PSMs, thereby demonstrating the potential advantages of wide-window data acquisition. PepPre is open-source and available at http://peppre.ctarn.io.
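The abstract does not describe PepPre's linear-programming formulation in detail, so it is not reproduced here. As background only, the sketch below illustrates what an isotope cluster looks like on the m/z axis: peaks of one precursor are spaced by roughly 1.003355/z, where z is the charge. The function names are hypothetical and the spacing is the approximate 13C-12C mass difference. This is not PepPre's method.

```python
# Approximate mass difference between 13C and 12C, in daltons.
ISOTOPE_SPACING_DA = 1.003355


def isotope_cluster_mz(mono_mz, charge, n_peaks=4):
    """Return expected m/z positions of the first n_peaks isotope peaks."""
    return [mono_mz + k * ISOTOPE_SPACING_DA / charge for k in range(n_peaks)]


def charge_from_spacing(mz_a, mz_b):
    """Estimate the charge from two adjacent isotope peaks of one cluster."""
    return round(ISOTOPE_SPACING_DA / abs(mz_b - mz_a))


# Hypothetical doubly charged precursor with monoisotopic m/z 500.0.
cluster = isotope_cluster_mz(500.0, 2)
print(cluster)
print(charge_from_spacing(cluster[0], cluster[1]))
```

When several precursors fall into one isolation window, their clusters overlap, which is why a spectrum may need to be explained by more than one precursor.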


Subjects
Peptides, Proteomics, Reproducibility of Results, Proteomics/methods, Software, Tandem Mass Spectrometry/methods, Protein Databases, Algorithms
20.
BMC Bioinformatics ; 24(1): 421, 2023 Nov 08.
Article in English | MEDLINE | ID: mdl-37940845

ABSTRACT

BACKGROUND: In proteomics, the interpretation of mass spectra representing peptides carrying multiple complex modifications remains challenging, as it is difficult to strike a balance between reasonable execution time, a limited number of false positives, and a huge search space allowing any number of modifications without a priori knowledge. The scientific community needs new developments in this area to aid in the discovery of novel post-translational modifications that may play important roles in disease. RESULTS: To make progress on this issue, we implemented SpecGlobX (SpecGlob eXTended to eXperimental spectra), a standalone Java application that quickly determines the best spectral alignments of a (possibly very large) list of Peptide-to-Spectrum Matches (PSMs) provided by any open modification search method, or generated by the user. As input, SpecGlobX reads a file containing spectra in MGF or mzML format and a semicolon-delimited spreadsheet describing the PSMs. SpecGlobX returns the best alignment for each PSM as output, splitting the mass difference between the spectrum and the peptide into one or more shifts while considering the possibility of non-aligned masses (a phenomenon resulting from many situations including neutral losses). SpecGlobX is fast, able to align one million PSMs in about 1.5 min on a standard desktop. First, we review the foundations of the algorithm and detail how we adapted SpecGlob (the method we previously developed with the same aim, but limited to the interpretation of perfect simulated spectra) to the interpretation of imperfect experimental spectra. Then, we highlight the value of SpecGlobX as a complementary tool downstream of three open modification search methods on a large simulated spectra dataset. Finally, we ran SpecGlobX on a proteome-wide dataset downloaded from PRIDE to demonstrate that SpecGlobX functions as well on experimental spectra as on simulated spectra. We then carefully analyzed a limited set of interpretations. CONCLUSIONS: SpecGlobX is helpful as a decision support tool, providing keys to interpret peptides carrying complex modifications still poorly considered by current open modification search software. Better alignment of PSMs enhances confidence in the identification of spectra provided by open modification search methods and should improve the interpretation rate of spectra.
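The "mass difference between the spectrum and the peptide" mentioned above can be illustrated with a short calculation. The sketch below is not SpecGlobX code and does not perform any alignment. It only computes the total mass difference for one PSM, which is the quantity an alignment would then split into shifts. The function names are hypothetical, the residue masses are standard monoisotopic values, and the example PSM is invented for illustration.

```python
# Standard monoisotopic residue masses, in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.007276  # mass of the charge carrier


def peptide_mass(sequence):
    """Monoisotopic neutral mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def precursor_mass(mz, charge):
    """Neutral precursor mass derived from observed m/z and charge."""
    return (mz - PROTON) * charge


def delta_mass(sequence, mz, charge):
    """Mass difference between the spectrum precursor and the peptide."""
    return precursor_mass(mz, charge) - peptide_mass(sequence)


# Invented PSM: peptide PEPTIDE matched to a 2+ precursor at m/z 440.670411.
shift = delta_mass("PEPTIDE", 440.670411, 2)
print(round(shift, 3))
```

In this invented example the difference is close to 79.966 Da, the mass of a phosphorylation. An aligner would still need to decide where along the sequence that shift belongs, and whether it is better explained as one shift or several.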


Subjects
Peptides, Proteomics, Proteomics/methods, Protein Databases, Mass Spectrometry/methods, Software, Algorithms